Refining Automatically-Discovered Lexical Relations: Combining Weak Techniques for Stronger Results

Author

  • Marti A. Hearst
Abstract

Knowledge-poor corpus-based approaches to natural language processing are attractive in that they do not incur the difficulties associated with complex knowledge bases and real-world inferences. However, these kinds of language processing techniques in isolation often do not suffice for a particular task; for this reason we are interested in finding ways to combine various techniques and improve their results. Accordingly, we conducted experiments to refine the results of an automatic lexical discovery technique by making use of a statistically-based syntactic similarity measure. The discovery program uses lexico-syntactic patterns to find instances of the hyponymy relation in large text bases. Once relations of this sort are found, they should be inserted into an existing lexicon or thesaurus. However, the terms in the relation may have multiple senses, thus hampering automatic placement. In order to address this problem we applied a term-similarity determination technique to the problem of choosing where, in an existing lexical hierarchy, to install a lexical relation. The union of these two corpus-based methods is promising, although only partially successful in the experiments run so far. Here we report some preliminary results, and make suggestions for how to improve the technique in future.
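
As a rough illustration of what a lexico-syntactic pattern can look like, the following Python sketch matches one pattern of the form "X such as A, B and C" and reads each listed term as a hyponym of the term before "such as". The abstract does not list the patterns the discovery program uses, so the pattern, the regular expressions, and the function name `find_hyponyms` are assumptions made for illustration only. The sketch handles single-word terms and does no noun-phrase parsing or sense disambiguation.

```python
import re

# Hypothetical pattern (an assumption, not taken from the abstract):
#   "<hypernym> such as <hyponym>, <hyponym> and <hyponym>"
# Longer separators are listed first so that ", and " is not read as ", ".
SUCH_AS = re.compile(
    r"(\w+),? such as ((?:\w+(?:, and |, or |, | and | or ))*\w+)"
)

# Separator between items in the captured list of hyponyms.
ITEM_SEP = re.compile(r",? (?:and |or )?")


def find_hyponyms(text):
    """Return (hyponym, hypernym) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(text):
        hypernym = match.group(1)
        for hyponym in ITEM_SEP.split(match.group(2)):
            pairs.append((hyponym, hypernym))
    return pairs


if __name__ == "__main__":
    sentence = "She collects instruments such as guitars, violins and cellos."
    for hyponym, hypernym in find_hyponyms(sentence):
        print(f"{hyponym} is a kind of {hypernym}")
```

On the sample sentence this yields the pairs (guitars, instruments), (violins, instruments), and (cellos, instruments). Each pair is only a candidate relation: as the abstract notes, the terms may have several senses, so a separate step is still needed to decide where in an existing hierarchy the relation belongs.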


Similar resources

A Method for Refining Automatically-Discovered Lexical Relations: Combining Weak Techniques for Stronger Results

Knowledge-poor corpus-based approaches to natural language processing are attractive in that they do not incur the difficulties associated with complex knowledge bases and real-world inferences. However, these kinds of language processing techniques in isolation often do not suffice for a particular task; for this reason we are interested in finding ways to combine various techniques and improve thei...


Automatic Discovery of Semantic Relations using MindNet

Information extraction deals with extracting entities (such as people, organizations or locations) and named relations between entities (such as "People born-in Country") from text documents. An important challenge in information extraction is the labeling of training data which is usually done manually and is therefore very laborious and in certain cases impractical. This paper introduces a new...


Building and Refining Rhetorical-Semantic Relation Models

We report results of experiments which build and refine models of rhetorical-semantic relations such as Cause and Contrast. We adopt the approach of Marcu and Echihabi (2002), using a small set of patterns to build relation models, and extend their work by refining the training and classification process using parameter optimization, topic segmentation and syntactic parsing. Using human-annotate...


Extraction, Evaluation and Integration of Lexical-Semantic Relations for the Automated Construction of a Lexical Ontology

Several approaches for extracting semantic relations from various types of resources have been proposed during the last years. While already of great value when used separately, combining these techniques promises to lead to even broader and more reliable results. However, divergent information may occur when assembling such data. We present LexO, a framework for integrating semantic relations ...


Combining pattern recognition and deep-learning-based algorithms to automatically detect commercial quadcopters using audio signals (Research Article)

Commercial quadcopters with many private, commercial, and public sector applications are a rapidly advancing technology. Currently, there is no guarantee to facilitate the safe operation of these devices in the community. Three different automatic commercial quadcopters identification methods are presented in this paper. Among these three techniques, two are based on deep neural networks in whi...




Publication date: 1992